(57) Abstract

The present invention relates to a method for in vitro creation
of molecular libraries and evolution of protein function.
Particularly, it relates to variability and modification of
protein function by shuffling polynucleotide sequence segments.
A protein of desired characteristics can be obtained by
incorporating variant peptide regions (variant motifs) into
defined peptide regions (scaffold sequence). The variant motifs
can be obtained from parent DNA which has been subjected to
mutagenesis to create a plurality of differently mutated
derivatives thereof, or they can be obtained from in vivo
sequences. These variant motifs can then be incorporated into a
scaffold sequence and the resulting encoded protein screened for
desired characteristics. This method is ideally used for
obtaining antibodies with desired characteristics by isolating
individual CDR DNA sequences and incorporating them into a
scaffold which may, for example, be from a totally different
antibody.

Field of the invention

The present invention relates to a method for in
vitro molecular evolution of protein function.
Particularly, but not exclusively, it relates to the
shuffling of polynucleotide sequence segments within a
coding sequence.



Background of the invention 

Protein function can be modified and improved in
vitro by a variety of methods, including site-directed
mutagenesis (Moore et al, 1987), combinatorial cloning
(Huse et al, 1989; Marks et al, 1992) and random
mutagenesis combined with appropriate selection systems
(Barbas et al, 1992).

The method of random mutagenesis together with
selection has been used in a number of cases to improve
protein function, and two different strategies exist.
The first is randomisation of the entire gene sequence in
combination with the selection of a variant (mutant)
protein with the desired characteristics, followed by a
new round of random mutagenesis and selection. This
method can then be repeated until a protein variant is
found which is considered optimal (Moore et al, 1996).
Here, the traditional route to introduce mutations is by
error-prone PCR (Leung et al, 1989) with a mutation rate
of approximately 0.7%.

Secondly, defined regions of the gene can be
mutagenized with degenerate primers, which allows for
mutation rates up to 100% (Griffiths et al, 1994; Yang et
al, 1995). The higher the mutation rate used, the more
limited the region of the gene that can be subjected to
mutations.

Random mutation has been used extensively in the
field of antibody engineering. In vivo formed antibody
genes can be cloned in vitro (Larrick et al, 1989) and
random combinations of the genes encoding the variable
heavy and light genes can be subjected to selection
(Marks et al, 1992). Functional antibody fragments
selected can be further improved using random mutagenesis
and additional rounds of selections (Hoogenboom et al,
1992).

The strategy of random mutagenesis is followed by
selection. Variants with interesting characteristics can
be selected and the mutagenized DNA regions from
different variants, each with interesting
characteristics, are combined into one coding sequence
(Yang et al, 1995). This is a multi-step sequential
process, and potential synergistic effects of different
mutations in different regions can be lost, since they
are not subjected to selection in combination. Thus,
these two strategies do not include simultaneous
mutagenesis of defined regions and selection of a
combination of these regions. Another process involves
combinatorial pairing of genes which can be used to
improve e.g. antibody affinity (Marks et al, 1992).
Here, the three CDR regions in each variable gene are
fixed and this technology does not allow for shuffling of
individual CDR regions between clones.

Selection of functional proteins from molecular
libraries has been revolutionized by the development of
the phage display technology (Parmley et al, 1987;
McCafferty et al, 1990; Barbas et al, 1991). Here, the
phenotype (protein) is directly linked to its
corresponding genotype (DNA) and this allows for direct
cloning of the genetic material, which can then be
subjected to further modifications in order to improve
protein function. Phage display has been used to clone
functional binders from a variety of molecular libraries
with up to 10^11 transformants in size (Griffiths et al,
1994). Thus, phage display can be used to directly clone
functional binders from molecular libraries, and can also
be used to improve further the clones originally
selected.

Random combination of DNA from different mutated
clones is a more efficient way to search through sequence
space. The concept of DNA shuffling (Stemmer, 1994)
utilises random fragmentation of DNA and assembly of
fragments into a functional coding sequence. In this
process it is possible to introduce chemically
synthesised DNA sequences and in this way target
variation to defined places in a gene whose DNA
sequence is known (Crameri et al, 1995). In theory, it
is also possible to shuffle DNA between any clones.
However, if the resulting shuffled gene is to be
functional with respect to expression and activity, the
clones to be shuffled have to be related or even
identical with the exception of a low level of random
mutations. DNA shuffling between genetically different
clones will generally produce non-functional genes.

Summary of the invention 

At its most general the present invention provides a 
method of obtaining a polynucleotide sequence encoding a 
protein of desired characteristics comprising the steps 
of incorporating at least one variant nucleotide region 
(variant motif) into defined nucleotide regions (scaffold 
sequence) derived from a parent polynucleotide sequence. 
The new assembled polynucleotide sequence may then be 
expressed and the resulting protein screened to determine 
its characteristics.

The present method allows protein characteristics to
be altered by modifying the polynucleotide sequence
encoding the protein in a specific manner. This may be
achieved by either a) replacing a specified region of the
nucleotide sequence with a different nucleotide sequence
or b) mutating the specified region so as to alter the
nucleotide sequence. These specified regions (variant
motifs) are incorporated within scaffold or framework
regions (scaffold sequence) of the original
polynucleotide sequence (parent polynucleotide sequence)
which when reassembled will encode a protein of altered
characteristics. The characteristics of the encoded
protein are altered as a result of the amino acid
sequence being changed corresponding to the changes in
the coding polynucleotide sequence.

Rather than modifying a sequence at random and then 
relying on extensive screening for the desired coded 
protein, the present inventors have found it desirable to 
provide a method which modifies selected segments 
(variant motifs) of a protein while maintaining others. 

The variant motifs may be segments of nucleotide
sequence that encode specified regions of a protein, for
example functional regions of a protein (e.g. loops) or
CDR regions in an antibody.

The scaffold sequence may be segments of nucleotide 
sequence which it is desirable to maintain, for example 
they may encode more structural regions of the protein, 
e.g. framework regions in an antibody. 

The variant motifs may be nucleotide segments which 
originated from the same polynucleotide sequence as the 
scaffold sequence, i.e. the parent polynucleotide 
sequence, but which have been mutated so as to alter the 
coding sequence from that in the parent. For example, the 
parent polynucleotide sequence may encode an antibody. 
The nucleotide sequences encoding the CDR regions of the 
antibody (variant motifs) may be selected from the 
remaining coding sequence of the parent polynucleotide, 
mutated and then reassembled with scaffold sequence 
derived from the remaining coding sequence. The expressed
antibody will differ from the wild type antibody 
expressed by the parent polynucleotide in the CDR regions 
only. 

Alternatively, the variant motif may be derived from
a polynucleotide sequence encoding a protein sequentially
related to the protein encoded by the parent
polynucleotide sequence. For example, the CDR regions
from one antibody (antibody A) may be replaced by the CDR
regions of another antibody (antibody B).

In each case the resulting expressed protein can be 
screened for desired characteristics. Desirable 
characteristics may be changes in the biological 
properties of the protein. For example, the tertiary 
structure of the protein may be altered. This may affect 
its binding properties, the ability for it to be secreted 
from cells or into cells or, for enzymes, its catalytic 
properties. If the protein is an antibody or part thereof 
it may be desirable to alter its ability to specifically 
bind to an antigen or to improve its binding properties 
in comparison to the parent antibody. 

According to one aspect of the present invention, 
there is provided a method of obtaining a protein of 
desired characteristics by incorporating variant peptide 
regions (variant motifs) into defined peptide regions 
(scaffold sequence), which method comprises the steps of: 

(a) subjecting parent polynucleotide sequence 
encoding one or more protein motifs to mutagenesis to 
create a plurality of differently mutated derivatives 
thereof, or obtaining parent polynucleotide encoding a 
plurality of variant protein motifs of unknown sequence, 

(b) providing a plurality of pairs of 
oligonucleotides, each pair representing spaced-apart 
locations on the parent polynucleotide sequence bounding 
an intervening variant protein motif, and using each said 
pair of oligonucleotides as amplification primers to 
amplify the intervening motif; 

(c) obtaining single-stranded nucleotide sequence
from the thus-isolated amplified nucleotide sequence; and 

(d) assembling nucleotide sequence encoding a 
protein by incorporating nucleotide sequences derived 
from step (c) above with nucleotide sequence encoding 
scaffold sequence. 

The method may further comprise the step of
expressing the resulting protein encoded by the assembled
nucleotide sequence and screening for desired properties.

Preferably the parent polynucleotide sequence is DNA
from which are derived DNA sequences encoding the variant
motifs and scaffold sequences.

Preferably the pairs of oligonucleotides are single-
stranded oligonucleotide primers. One of said pair may
be linked to a member of a specific binding pair (MSBP).
The MSBP is preferably biotin, whose specific binding
partner could for example be streptavidin. By using the
specific binding pair the amplified nucleotide sequences
may be isolated.

Random mutation can be accomplished by any
conventional method, but a suitable method is error-prone
PCR.

The protein in question could, for example, be an
antibody or antibody fragment having desirable
characteristics. Examples of antibody fragments capable
of binding an antigen or other binding partner are the
Fab fragment consisting of the VL, VH, CL and CH1
domains; the Fd fragment consisting of the VH and CH1
domains; the Fv fragment consisting of the VL and VH
domains of a single arm of an antibody; the dAb fragment
which consists of a VH domain; isolated CDR regions; and
F(ab')2 fragments, a bivalent fragment including two Fab
fragments linked by a disulphide bridge at the hinge
region. Single chain Fv fragments are also included.

In one approach, after randomly mutating DNA
encoding the antibody, or a portion of that DNA (eg that
which encodes the Fab regions or variable regions),
oligonucleotide primers could be synthesised
corresponding to sequences bounding the CDRs (the variant
motifs), so that DNA encoding the CDRs is amplified,
along with any mutations that may have occurred in the
CDRs. These can be incorporated in the reassembly of the
antibody coding sequence, using the amplified CDR DNA
sequences and the unmutated scaffold framework (FR) DNA
sequences, resulting in the expression of an antibody
which has a novel combination of CDRs, and which can
potentially be screened for in conventional manner.

In another approach, rather than mutating CDRs and
reassembling them back into an antibody which will be
closely related to the parent antibody from which the
CDRs were derived, the CDRs may be taken from one or more
existing antibodies, but be of unknown sequence. Using
oligonucleotide primers representing sequences bounding
the various CDRs, the individual CDRs can be amplified,
isolated and assembled into a predetermined scaffold.

Of course, combinations of the foregoing approaches
could be used, with CDRs taken from one or more parent
antibodies and assembled into a scaffold to produce a
completely new, secondary antibody. Then, after screening
to obtain a secondary antibody with desired
characteristics, the DNA encoding it could be mutated,
the CDRs amplified and isolated, and then reassembled
with unmutated non-CDR (scaffold) DNA from the secondary
antibody, to produce variants of the secondary antibody
which are mutated in the CDRs, and which can be screened
for improved properties with respect to the originally
selected secondary antibody.

The present invention allows a novel way for the
isolation of DNA sequences from genetically related
clones that are functionally different. Genetically
related clones are those that belong to a particular
structural class, for example immunoglobulins or
alpha-beta-barrels. The invention allows for both isolation
and random combination into a given DNA sequence of
functional sequences from these related clones. These
functional sequences may be loops that perform binding or
catalysis.

The concept of the invention is demonstrated using
antibody molecules where CDR regions from different
germline sequences can be isolated and randomly combined
into a defined framework sequence. The invention expands
the complexity of the molecular libraries that can be
selected using phage display. The concept of the
invention is also demonstrated by the affinity maturation
of antibody fragments by the isolation and random
combination of mutated CDR regions.

It is not possible to use the DNA shuffling concept
(Stemmer, 1994) to isolate specific sequences and
randomly combine these into a given gene sequence, as it
is not possible to amplify individual DNA regions formed
in vivo using DNA shuffling. Combination of entire gene
sequences is possible, but here defined regions cannot be
shuffled. Rather, all the DNA is shuffled. Thus, DNA
sequences from genetically related clones that are
functionally different, eg proteins that belong to
structural classes like immunoglobulins or
alpha-beta-barrels, cannot be shuffled in such a way that
specific regions are kept constant and other regions are
shuffled.

The system provided by the present invention offers
a simple way to randomly combine functional regions of
proteins (eg loops) to a defined (specifically selected)
scaffold, ie shuffling of loops to a given protein
tertiary structure in order to find new protein
functions. Furthermore, the DNA shuffling technology
introduces mutations at a rate of 0.7% (Stemmer, 1994).
Thus, the known DNA shuffling technology (Stemmer, 1994)
does not allow for shuffling of unmutated regions, since
the process itself introduces mutations at random
positions, including the scaffold regions.

In contrast, the invention allows for mutagenesis of
defined DNA sequences together with shuffling and
assembly of these pieces of DNA into a coding region, and
will allow for mutagenesis of defined regions and
subsequent selection of these regions in combination.

The invention allows for different regions of DNA
from different sequences (clones) to be shuffled and
randomly combined. This increases the genetic variation
from which functional antibody fragments are selected and
will thus increase the probability of selecting proteins
with the desired characteristics. It can be realised that
by randomly shuffling as few as a hundred CDRs at each
position in the VH and VL of an antibody fragment, as many
as 10^12 combinations may be obtained, thereby extending the
variability normally found in the immune system.
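
The figure of 10^12 follows from simple counting if one assumes
three CDR positions in the VH and three in the VL, with one
hundred variants available at each position and independent
choice between positions. The following is a minimal sketch of
that arithmetic only; the function name library_size is
illustrative and is not part of the method described.

```python
from math import prod


def library_size(variants_per_position):
    """Number of distinct combinations when every CDR position draws
    independently from its own pool of variants."""
    return prod(variants_per_position)


# Assumption: three CDRs in VH plus three CDRs in VL,
# one hundred variants at each of the six positions.
positions = [100] * 6
size = library_size(positions)
print(f"theoretical combinations: {size:.0e}")
```

Unequal pool sizes can be passed in the same way, for example
library_size([100, 100, 500, 100, 100, 50]).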

The invention provides amplification of defined 
regions from eg a cDNA library using two primers of which 
one is biotinylated. Using the MSBP, e.g. biotin, group, 
single stranded DNA can be isolated and used in the gene 
assembly process. The present inventors have 
demonstrated this with the amplification of diverse CDR 
regions from an antibody gene library and the combination 
of these CDR regions randomly to a given framework 
region. Thus, defined regions of DNA (framework regions)
can be interspaced by random regions of DNA (CDR
regions), which have an in vivo origin or can be
chemically synthesized.

The present invention also provides polynucleotide
sequences, and the proteins they encode, produced by the
method described above. There are also provided vectors
incorporating the polynucleotide sequences and host cells
transformed by the vectors.

The present invention also provides a polynucleotide 
library comprising polynucleotides created by the method 
described above which may be used for phage display. 

Aspects and embodiments of the present invention 
will now be illustrated, by way of example, with 
reference to the accompanying figures. Further aspects 
and embodiments will be apparent to those skilled in the 
art. All documents mentioned in this text are 
incorporated herein by reference. 

Brief description of the drawings 

Figure 1 shows shuffling of specific DNA sequences 
between different clones, based on the assembly of gene 
sequences from a set of overlapping oligo-nucleotides 
following a one-step PCR protocol.

Figure 2 shows different dissociation rate constants
for different CDR-shuffled clones. A low bar represents 
slow dissociation-rate, a high bar represents a fast 
dissociation-rate. Clone 36 is the original non-mutated 
antibody fragment. 

Figure 3 shows the results of affinity purified scFv
antibody fragment assayed on HPLC, Superose S-200 FPLC
column (Pharmacia) in PBS buffer. Peak 1 is the monomeric
form of the antibody fragment, peak 2 is a small amount
of impurity and peak 3 is NaN3 (sodium azide), used as a
preservative.

Figure 4 shows a schematic representation of
amplification of defined sequences of DNA and the
shuffling of these into a master framework. Only the CDR
regions are amplified. Figure 4A: Assembly of genes for
the VH domain. The template is scFv-B11 mutated with
error-prone PCR. An individual CDR is amplified using two
primers adjacent to the particular CDR, and one of these
primers is biotinylated at the 5' end. The individual CDR
is amplified and double-stranded DNA (dsDNA) is produced
with the mutations focused to the CDR, since the two
amplification primers do not contain any mutations. This
DNA is separated into two single-stranded DNA molecules.
The molecule without biotin is used in gene assembly.
Primers 725, 729, 730, 728, 727 are synthesized in a DNA
synthesizer and primers H2, H3, H5 contain mutated CDR
and are amplified as above. Figure 4B: Assembly of genes
for the VL domain. CDRs are amplified in the same way as
in A. Primers 759, 738, 745, 744, 880 are synthesized in
a DNA synthesizer and primers L2, L3, L5 contain mutated
CDR and are amplified as above.

Figure 5 shows the alignment of the peptide
sequences for clones 3, 11 and 31 with the original
non-mutated antibody fragment (wt). The CDR regions are
marked. Mutations in clones 3, 11 and 31 are underlined.

Figure 6 shows the principles for the isolation of
single-stranded DNA for the shuffling of defined DNA
regions.

Figure 7 shows the length of CDR3 heavy chain from
different clones. These CDR regions have been amplified
from different germline sequences and randomly cloned to
a defined framework region (from DP-47 sequence).

Figure 8 shows a schematic representation of
amplification of defined sequences of DNA and the
shuffling of these into a master framework. All the
oligonucleotides used in the gene assembly are amplified
by PCR, but only the CDR regions contain any genetic
variation. Figure 8A: Assembly of genes for the VH
domain. The template for the framework region
amplification is scFv-B11, whereas CDRs are amplified
from cDNA prepared from peripheral blood lymphocytes,
tonsils and spleen. An individual DNA fragment is
amplified using two primers located at the ends of the
fragment to be amplified, and one of these primers is
biotinylated at the 5' end. The individual DNA fragment
is amplified and double-stranded DNA (dsDNA) is produced.
This DNA is separated into two single-stranded DNA
molecules. The molecule without biotin is used in gene
assembly, i.e. primers H1, H4, H6, and these primers
contain no variation. Primers HCDR1, HCDR2, HCDR3 contain
different CDRs and are amplified using two primers
adjacent to the particular CDR, one of which
is biotinylated at the 5' end. The individual CDR is
amplified and double-stranded DNA (dsDNA) is produced
with the variation focused to the CDR, since the two
amplification primers do not contain any mutations. This
DNA is separated into two single-stranded DNA molecules
and used in gene assembly of the VH domain in a library
format, i.e. the variation in the CDRs is derived from
different germline sequences. Primers BT25 and BT26 are
synthesized in a DNA-synthesizing machine. Figure 8B:
Assembly of genes for the VL domain. In principle the
same procedure as in A. Primers L1, L4, L6 are amplified
and produced by PCR and contain no variation. LCDR1,
LCDR2, LCDR3 contain different CDRs. Primers BT7 and BT12
are synthesized in a DNA-synthesizing machine.

Figure 9 shows the variation in a library
constructed according to Fig. 8. The scFv region of
library clones and original scFv-B11, binding to FITC
(fluorescein-iso-thiocyanate), was synthesized by PCR.
Purified PCR products were cut with BstNI and separated
on a 2.5% agarose gel. Clones 1-15 are in lane 2-16,
clones 16-29 are in lane 18-31. Original scFv-B11 is in
lane 32. Analysis revealed that 28 clones could be sorted
in 13 different groups according to restriction pattern
and fragment size. Eight clones (1, 2, 8, 10, 12, 16, 26,
27) were unique, 2 clones (17, 24) appeared similar, 1
group of clones (18, 23, 29) had 3 similar members, 2
groups (5, 15, 14, 19) and (3, 4, 6, 11) had 4 members
and 1 group (7, 9, 13, 20, 21, 22, 25) had 7 similar
members. This experiment underestimates the variation in
the library since BstNI detects only a fraction of
sequence variability. In addition, the gel resolution did
not allow the detection of minor size differences and did
not resolve fragments below 100 bp.

Figure 9B shows clones showing similar restriction
pattern in the experiment exemplified in Figure 9A cut by
both BstNI and BamHI and separated on 3% agarose gels. To
facilitate comparison, the groups of similar clones
described in experiment A were put together on the gels.
Clone 8 and 28 from experiment A were excluded due to
space limitations.

Gel I) Lane 1-8; standard, clone 5,15,14,19,2,27,
original scFv-B11, respectively

Gel II) Lane 1-8; standard, clone 16,17,24,18,23,29,26,
respectively

Gel III) Lane 1-8; standard, clone 7,9,13,20,21,22,25,
respectively

Gel IV) Lane 1-8; standard, clone 3,4,6,11,1,10,12,
respectively

Under these improved experimental conditions,
essentially all clones had different restriction
patterns/fragment sizes. All clones were different from
the original scFv-B11 gene (lane 8, gel I). Moreover, the
groups of clones which appeared similar in Figure 9A were
found to be different as analyzed in Figure 9B. See clones
5,15,14,19 (lanes 2-5 gel I), clones 17,24 (lanes 3-4 gel
II), clones 18,23,29 (lanes 5-7 gel II), clones
7,9,13,20,21,22,25 (lanes 2-8, gel III) and clones
3,4,6,11 (lanes 2-5 gel IV).

In conclusion, these experiments suggest that the
library contains high variability.

Detailed description and exemplification of the invention

One aspect of the DNA shuffling procedure can be
illustrated by the following steps in Fig 1.

A: A gene coding for a protein of interest is
divided into overlapping oligonucleotides.

B: The oligonucleotides are assembled using PCR into
a full-length gene sequence.

C: The gene sequence is subjected to mutagenesis, eg
by error-prone PCR.

D: Pairs of oligonucleotides are synthesized, each
pair covering a region defined by one of the
oligonucleotides in step A above, except for a region
located in the middle of the step A oligonucleotide.
This uncovered region is the DNA sequence that can be
shuffled after PCR amplification. These two synthesised
oligonucleotides can thus be used as amplification
primers to amplify the uncovered region.

E: One of these amplification primers is
biotinylated and the double-stranded PCR product can then
be isolated using well-known streptavidin systems.

F: From the thus isolated amplified oligonucleotides
can be obtained a single-stranded DNA sequence containing
DNA from the uncovered region mentioned above, which can
then be used as an oligonucleotide in a new assembly of
the gene sequence as described in step A.

G: If DNA sequences from different clones and from
different regions of the mutated gene sequence are
amplified and made single-stranded, they will combine
randomly in the PCR process of gene assembly. This
random combination is the basis for in vitro molecular
evolution.
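
The random combination described in step G can be pictured with
a small simulation. The sketch below is illustrative only: it
treats gene assembly as string concatenation, assumes a fixed
set of framework segments interleaved with one randomly chosen
variant per CDR position, and uses short placeholder sequences
that are not taken from scFv-B11. The names shuffle_cdrs,
frameworks and cdr_pools are hypothetical.

```python
import random


def shuffle_cdrs(frameworks, cdr_pools, rng):
    """Interleave fixed framework segments with one randomly chosen
    variant per CDR position: FR1-CDR1-FR2-CDR2-...-FRn."""
    if len(frameworks) != len(cdr_pools) + 1:
        raise ValueError("need one more framework segment than CDR pools")
    parts = []
    for framework, pool in zip(frameworks, cdr_pools):
        parts.append(framework)         # unmutated scaffold segment
        parts.append(rng.choice(pool))  # variant motif drawn at random
    parts.append(frameworks[-1])
    return "".join(parts)


# Placeholder sequences (assumptions, for illustration only).
frameworks = ["AAAA", "CCCC", "GGGG", "TTTT"]
cdr_pools = [["ACG", "ACT"], ["GTA", "GTC", "GTG"], ["TAC", "TAG"]]

rng = random.Random(0)  # seeded so that a run is reproducible
library = {shuffle_cdrs(frameworks, cdr_pools, rng) for _ in range(200)}
print(len(library), "distinct assembled sequences out of a possible 12")
```

Because the framework segments are never drawn from a pool, any
variation in the simulated library is confined to the CDR
positions, which mirrors the distinction drawn above between
variant motifs and scaffold sequence.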

Examples 

The present inventors have demonstrated the concept
of shuffling of defined DNA in different experimental
settings. Firstly, the shuffling of in vitro mutated CDR
regions in an antibody fragment for affinity maturation
purposes (example 1 and 2) is exemplified and secondly
the shuffling of in vivo formed CDRs for creation of a
highly variable antibody library (example 3 and 4) is
exemplified.

1. Affinity maturation

A model system was developed, based on the scFv-B11
antibody fragment which binds to FITC. The full-length
gene encoding this scFv was assembled from a set of 12
oligonucleotides (Fig. 4A and Fig. 4B) representing the
known DNA sequence of the scFv-B11, and the functional
binding of the gene product to FITC could be verified.
This gene sequence was then mutagenised using error-prone
PCR, and the DNA encoding the CDR regions was amplified
as described above, using the amplification primers, one
of which is biotinylated. (The CDR regions are the parts
of the antibody molecule involved in binding the antigen,
in this case FITC.)

All six CDR regions were amplified and a new gene
was assembled using six oligonucleotides selected from
the first assembly of 12 oligonucleotides (see above)
(these were not mutagenized) and six from the
amplification of mutagenized CDR regions. Selection of
functional antibody fragments that bound FITC was carried
out using phage display. 50% of the clones bound FITC
with different dissociation-rates than did the original
scFv-B11, as measured in the BIAcore biosensor (Figure
2). This demonstrates that the clones were changed in
the way they recognized FITC.

Of the 16 clones identified to bind FITC in BIAcore
(Figure 2), clones 3, 11, 27 and 31 were chosen to be
analyzed in more detail as these clones exhibited the
larger changes in off-rates. These clones were expressed
and affinity-purified on a column conjugated with FITC-
BSA and eluted with a low pH buffer. The purified scFv
antibody fragments were further purified and analyzed
with HPLC, using a Pharmacia Superdex 200 FPLC column
with the capacity to separate the monomeric and dimeric
form of the antibodies. In all clones the monomeric form
dominated (typical size profile is shown in Figure 3).
This was then purified and used in detailed analysis of
affinity using a BIAcore biosensor (Table 1).

Table 1.

Affinity determination of selected clones.

Clone                 k_ASS (M^-1 s^-1)   k_DISS (s^-1)   K_A (M^-1)

#3                    2.0 x 10^5          4.3 x 10^-3     4.8 x 10^7
#11                   2.6 x 10^5          3.3 x 10^-3     7.8 x 10^7
#27                   5.0 x 10^5          16.0 x 10^-3    3.1 x 10^7
#31                   1.2 x 10^5          5.4 x 10^-3     2.1 x 10^7
(FITC-B11 original)   2.7 x 10^5          9.7 x 10^-3     2.8 x 10^7
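
The affinity constants in Table 1 can be cross-checked against
the rate constants, on the assumption that the affinity constant
is the ratio of the association rate constant to the dissociation
rate constant (K_A = k_ASS / k_DISS). The sketch below recomputes
that ratio and the fold change relative to the original clone;
the variable names are illustrative, and small differences from
the tabulated values are to be expected from rounding of the
published rate constants.

```python
# (k_ass in M^-1 s^-1, k_diss in s^-1), as listed in Table 1.
rates = {
    "#3": (2.0e5, 4.3e-3),
    "#11": (2.6e5, 3.3e-3),
    "#27": (5.0e5, 16.0e-3),
    "#31": (1.2e5, 5.4e-3),
    "original": (2.7e5, 9.7e-3),
}


def affinity(k_ass, k_diss):
    """Affinity constant in M^-1, assuming K_A = k_ass / k_diss."""
    return k_ass / k_diss


k_a = {clone: affinity(*pair) for clone, pair in rates.items()}
fold = {clone: value / k_a["original"] for clone, value in k_a.items()}

for clone in rates:
    print(f"{clone:>8}  K_A = {k_a[clone]:.1e} M^-1  fold = {fold[clone]:.1f}")
```

Under this assumption clone #11 comes out at roughly 2.8 times
the affinity of the original fragment, and clone #27 stays close
to the original because its faster association is offset by a
faster dissociation.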



Clone #11 exhibited an affinity 2.8 times higher
than the original scFv-B11 antibody fragment. This
increase is based on a slower off-rate. One clone (#27)
showed a 2 times increase in association-rate. However, the
overall affinity of this clone was similar to the
original FITC-B11 clone due to a faster dissociation-
rate. The distribution of different association and
dissociation-rates among the clones was considered a
source for CDR-reshuffling for further improvement of
affinities.

Three clones were sequenced. In the VH region (ie
half of the scFv-B11 and carrying three CDR regions) the
mutations found were all in the CDR regions as expected,
since these were the only regions mutagenized and
amplified using the amplification primers. Interestingly,
all the CDR regions were different and carried different
mutations (Figure 5). However, in the case of CDR region
2, the same mutation was found (a tyrosine to histidine
substitution) in all 3 clones (the rest of the CDR regions
differed between the clones).

Furthermore, the mutation rates were found to be in 
between 2% and 4%, as determined from the base changes in 
the 90 bp long sequence built up from three CDR regions 
together. This is more than the error-prone PCR mutation 
rate, and indicates that there is combination of 
individual CDR regions from different clones. 
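
For orientation, a rate of 2% to 4% over a 90 bp sequence
corresponds to only a few base changes per clone. The sketch
below shows one way such a rate could be computed from an
aligned parent and variant sequence. It is an illustration under
stated assumptions: the sequences are placeholders rather than
the scFv-B11 CDR sequences, and mutation_rate is a hypothetical
helper that counts substitutions only.

```python
def mutation_rate(parent, variant):
    """Fraction of positions that differ between two aligned
    sequences of equal length (substitutions only)."""
    if len(parent) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    changes = sum(a != b for a, b in zip(parent, variant))
    return changes / len(parent)


# Placeholder 90-nucleotide parent (assumption, not a real CDR sequence).
parent = "ACGTACGTAC" * 9

# Introduce three substitutions at arbitrary positions.
bases = list(parent)
for position in (4, 40, 77):
    bases[position] = "T" if bases[position] != "T" else "A"
variant = "".join(bases)

rate = mutation_rate(parent, variant)
print(f"{rate:.1%} of {len(parent)} positions changed")

# Number of base changes per 90 bp and the rate each corresponds to.
for changes in range(1, 5):
    print(changes, f"{changes / 90:.1%}")
```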

2. Affinity maturation-reshuffling

In order to perform a second shuffling
(reshuffling), clones selected for their binding affinity
to FITC were used in an additional round of CDR
amplification and library construction. In theory, the
reshuffled library will contain mutated shuffled CDR
regions, selected for improved binding to FITC. In this
way, new combinations of CDR regions, improved with
respect to binding, could be constructed and the library
subjected to selection for binders with improved
affinities.

The pool of all clones obtained from the selection
procedure (as detailed in example 1) was used as
template for CDR amplifications. One amplification was
carried out for each CDR using primers listed in Table 2.

Table 2 

Sequences for primers used in CDR-shuffling.
B = Biotin labeled 5' primer

CDR Reamplification Primers

764  5' B-GTC CCT GAG ACT CTC CTG TGC AGC CTC TGG ATT CAC CTT T 3'
875  5' TCC CTG GAG CCT GGC GGA CCC A 3'
[number illegible]  5' CGC CAG GCT CCA GGG AAG GGG CTG GAG TGG GTC TCA 3'
765  5' B-GGA ATT GTC TCT GGA GAT GGT GAA 3'
799  5' GAG CCG AGG ACA CGG CCG TGT ATT ACT GTG CAA GA 3'
766  5' B-GCG CTG CTC ACG GTG ACC AGG GTA CCT TGG CCC CA 3'
767  5' B-AGC GTC TGG GAC CCC CGG GCA GAG GGT CAC CAT CTC TTG T 3'
800  5' GGG CCG TTC CTG GGA GCT GCT GGT ACC A 3'
801  5' GCT CCC AGG AAC GGC CCC CAA ACT CCT CAT CTA T 3'
768  5' B-GAC TTG GAG CCA GAG AAT CGG TCA GGG ACC CC 3'
802  5' CTC CGG TCC GAG GAT GAG GCT GAT TAT TAC TGT 3'
769  5' B-CGT CAG CTT GGT TCC TCC GCC GAA 3'


Framework VH 

727 5' CCG CCG GAT CCA CCT CCG CCT GAA CCG CCT CCA CCG CTG CTC 
ACG GTG ACC A 3' 

728 5' GAC CGA TGG ACC TTT GGT ACC GGC GCT GCT CAC GGT GAC CA 3'

729 5' GAG GTG CAG CTG TTG GAG TCT GGG GGA GGC TTG GTA CAG CCT
GGG GGG TCC CTG AGA CTC TCC TGT 3'

730 5' GGC CGT GTC CTC GGC TCT CAG GCT GTT CAT TTG CAG ATA CAG
CGT GTT CTT GGA ATT GTC TCT GGA GAT GGT 3' 

Framework VL 

738 5' CAG TCT GTG CTG ACT CAG CCA CCC TCA GCG TCT GGG ACC CCC 
G 3' 

744 5' ACT AGT TGG ACT AGC CAC AGT CCG TGG TTG ACC TAG GAC CGT 
CAG CTT GGT TCC TCC GC 3' 

745 5' CTC ATC CTC GGA CCG GAG CCC ACT GAT GGC CAG GGA GGC TGA 
GGT GCC AGA CTT GGA GCC AGA GAA TCG 3' 

1129 5' CAG GCG GAG GTG GAT CCG GCG GTG GCG GAT CGC AGT CTG TGC 
TGA CTC AGC CAC CCT CAG CGT CTG GGA CCC CCG 3' 

Amplification primers VH/VL Assembly 

1125 5' ACT CGC GGC CCA ACC GGC CAT GGC CGA GGT GCA GCT GTT GGA 
G 3' 

1126 5' CAA CTT TCT TGT CGA CTT TAT CAT CAT CAT CTT TAT AAT CAC 
CTA GGA CCG TCA GCT TGG T 3' 
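
Several primers in Table 2 read as opposite-strand partners of one another. The following Python sketch shows one way to check this by computing a reverse complement and the longest overlap between two primers. It uses primer 875 and the primer listed directly after it in Table 2; the helper names `reverse_complement` and `suffix_prefix_overlap` are introduced here for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (spaces are ignored)."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


# Sequences copied from Table 2 (5' to 3').
primer_875 = "TCC CTG GAG CCT GGC GGA CCC A"
next_primer = "CGC CAG GCT CCA GGG AAG GGG CTG GAG TGG GTC TCA"

rc_875 = reverse_complement(primer_875)
overlap = suffix_prefix_overlap(rc_875, next_primer.replace(" ", ""))
print(rc_875, overlap)
```

The reverse complement of primer 875 ends with the same 16 nucleotides with which the following primer begins, which would be consistent with the two primers annealing to opposite strands of the same framework stretch.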
The amplification was performed according to the
following parameters: 100 ng template (1.6 x 10^8 CFU
bacteria grown for 6 h), 60 pmol each primer, 5 Units
PFU polymerase (Stratagene), 1 x PFU buffer, 500 µM
dNTPs, reaction volume 100 µl, preheat 96°C for 10
minutes, 96°C for 1 minute: 68°C for 1 minute: 72°C for 1
minute for 25 cycles, 72°C for 10 minutes. This procedure
was essentially the same as for CDR amplification in
Example 1. The amplified CDR were used for assembly into
VH and VL encoding sequence according to Figure 1, 4A, 4B
and Table 3.
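
The cycling conditions given above can be written as a small data structure, which makes the total programmed time and the final primer concentration easy to check. The following Python sketch is only an illustration: the names `Step`, `total_minutes` and `concentration_uM` are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int     # hold temperature in degrees Celsius
    minutes: float  # hold time in minutes


# Values taken from the text: preheat, 25 three-step cycles, final extension.
PREHEAT = Step(96, 10)
CYCLE = (Step(96, 1), Step(68, 1), Step(72, 1))
N_CYCLES = 25
FINAL_EXTENSION = Step(72, 10)


def total_minutes() -> float:
    """Sum of the programmed hold times, ignoring ramping."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return PREHEAT.minutes + N_CYCLES * per_cycle + FINAL_EXTENSION.minutes


def concentration_uM(pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in microlitres to micromolar."""
    return pmol / volume_ul  # pmol per microlitre equals micromol per litre


print(total_minutes())
print(concentration_uM(60, 100))
```

Under these assumptions the program holds for 95 minutes in total, and 60 pmol of primer in 100 µl corresponds to 0.6 µM.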

Table 3

PCR parameters for the assembly of VH and VL gene
sequences in CDR-shuffling

VL              VH              Amount

Primer 759      Primer 725      30 pmol
Primer 738      Primer 729      0.6 pmol
Primer L2       Primer H2       0.6 pmol
Primer L3       Primer H3       0.6 pmol
Primer 745      Primer 730      0.6 pmol
Primer L5       Primer H5       0.6 pmol
Primer 744      Primer 728      0.6 pmol
Primer 880      Primer 727      30 pmol
Taq             Taq             10 Units
dNTPs           dNTPs           200 µM
1x Taq buffer   1x Taq buffer   to 100 µl

Preheat 95° 10 minutes, 20 cycles: 95° 1 minute, 68° 1
minute, 72° 1 minute, and 72° 10 minutes.


The VH and VL were then assembled into a scFv
encoding sequence according to standard procedures
(Griffiths et al 1994). The resulting library was
subjected to panning so as to select binders with
affinities to FITC. The selection procedure for
the reshuffled library was essentially the same as for
the initially shuffled library. The total [...]
from the shuffled library (Table 4).



Table 4

[illegible] rates of individual clones selected from the
shuffled library (clones A) and from the reshuffled
library (clones B).

Clone                 Rate

scFv-B11 (original)   12.5
1A                    6.3
12A                   5.7
13A                   9.0
14A                   9.7
16A                   1.8
17A                   7.9
22B                   0.2
31B                   0.3
32B                   9.8
33B                   6.8
34B                   7.3
35B                   8.7

3. Cloning and shuffling of defined DNA regions

In our system it is possible to amplify defined
regions from a cDNA library using two primers of which
one is biotinylated. Using the biotin group, single
stranded DNA can be isolated and used in the gene assembly
process (Figure 6). We have demonstrated this with the
amplification of diverse CDR regions from an antibody
gene library and the random combination of these CDR regions
with a given framework region. Thus, defined
regions of DNA (framework regions) can be interspaced by
random regions of DNA (CDR regions) which have an in vivo
origin (Table 5). The CDR3 regions vary in size (Figure
7). Alternatively, these regions could be chemically
synthesised.
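
The combinatorial idea described above can be illustrated in a few lines of Python. This is a toy model of the design only, not of the PCR chemistry: the framework and CDR strings are short hypothetical placeholders, and `assemble_variant` and `theoretical_diversity` are names introduced here for illustration.

```python
import random

# Hypothetical placeholder segments; real framework and CDR fragments are far longer.
FRAMEWORKS = ["FR1-", "-FR2-", "-FR3-", "-FR4"]  # fixed scaffold, in order
CDR_POOLS = [
    ["cdr1a", "cdr1b", "cdr1c"],  # pool of amplified CDR1 fragments
    ["cdr2a", "cdr2b"],           # pool of amplified CDR2 fragments
    ["cdr3a", "cdr3bb", "cdr3ccc"],  # pool of CDR3 fragments of varying length
]


def assemble_variant(rng: random.Random) -> str:
    """Interleave the fixed framework segments with one randomly drawn CDR per position."""
    parts = []
    for framework, pool in zip(FRAMEWORKS, CDR_POOLS):
        parts.append(framework)
        parts.append(rng.choice(pool))
    parts.append(FRAMEWORKS[-1])  # closing framework segment
    return "".join(parts)


def theoretical_diversity() -> int:
    """Number of distinct CDR combinations the pools can give."""
    size = 1
    for pool in CDR_POOLS:
        size *= len(pool)
    return size


rng = random.Random(0)
library = {assemble_variant(rng) for _ in range(50)}
print(theoretical_diversity(), len(library))
```

The framework segments stay constant in every variant, while the number of possible variants is the product of the pool sizes.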

Table 5

Combination of CDR regions from different germline sequences
transplanted to the DP-47 framework encoding the variable heavy
domain. For CDR1 and CDR2 the suggested germline origin is indicated.
For CDR3 the number of residues in the CDR region is given. N.D. =
not determined.

Clone   CDR1    CDR2    CDR3

1       DP-35   DP-42   12
2       DP-49   DP-53   13
3       N.D.    DP-51   11
4       DP-32   DP-47   10
5       DP-41   DP-47   8
6       DP-32   DP-77   9
7       DP-31   DP-47   7
8       DP-49   DP-35   5
9       DP-49   DP-35   N.D.
10      DP-48   DP-48   N.D.
11      DP-51   DP-47   10
12      DP-34   DP-31   N.D.
13      DP-85   DP-53   4
14      DP-31   DP-77   10
15      DP-34   DP-53   4

4. Library construction

A gene library was constructed encoding scFv
antibody fragments. The strategy used for this library is
based on the assembly of a set of oligonucleotides into a
sequence encoding VH and VL antibody domains (Figure 8A,
8B). Native in vivo formed CDR regions can be shuffled
and assembled into a given master framework. In this
example we have developed this concept further and
assembled both VH and VL encoding gene sequences with
native CDR regions into a given master framework. Thus,
all six CDR positions have been shuffled. The template
origin for CDR amplification was cDNA from peripheral
blood B-cells, spleen, tonsils and lymph nodes.
Oligonucleotides encoding the framework regions have also
been amplified using the strategy with two flanking
primers, where one is biotinylated (primers L1, H1, L4, H4,
L6, H6). The primers used are described in Table 6 and in
Figure 8A, 8B.

Table 6 

Sequences for primers used in library construction.
B = Biotin labeled 5' primer

Amplification of framework fragments

BT1.  5' ACA GTC ATA ATG AAA TAG CTA TTG C 3'
BT2.  5' B-GC ACA GGA GAG TCT CA 3'
BT3.  5' B-CA CCA TCT CCA GAG ACA ATT CC 3'
BT4.  5' GGC CGT GTC CTC GGC TCT 3'
BT5.  5' B-TG GTC ACC GTG AGC AGC 3'
BT6.  5' CCG CCG GAT CCA CCT 3'
BT7.  5' CAG GCG GAG GTG GAT CCG GC 3'



WO 98/32845 



PCT/GB98/00219 . 



BT8.  5' B-CG GGG GTC CCA GAC GCT 3'
BT9.  5' B-CG ATT CTC TGG CTC CAA GT 3'
BT10. 5' CTC ATC CTC GGA CCG GA 3'
BT11. 5' B-TC GGC GGA GGA ACC AAG CT 3'
BT12. 5' TGG CCT TGA TAT TCA CAA ACG AAT 3'

Amplification of in vivo CDR

BT13. 5' B-TC CCT GAG ACT CTC CTG TGC AGC CTC TGG ATT CAC CTT 3'
BT14. 5' TTC CCT GGA GCC TGG CGG ACC CA 3'
[label illegible] 5' B-GG AAT TGT CTC TGG AGA TGG TGA A 3'
[label illegible] 5' GTC CGC CAG GCT CCA 3'
[label illegible] 5' B-CG CTG CTC ACG GTG ACC AGT GTA CCT TGG CCC CA 3'
BT19. 5' B-AG CGT CTG GGA CCC CCG GGC AGA GGG TCA CCA TCT CTT 3'
[label illegible] 5' B-GA CTT GGA GCC AGA GAA TCG GTC AGG GAC CCC 3'

[Primer labels BT13 to BT24 appear in this part of the table; the remaining
sequences and their label assignments are not legible.]

Assembly of VH and VL

BT25. 5' B-TA CCT ATT GCC TAC GGC AGC CGC TGG ATT GTT ATT ACT CGC GGC
CCA GCC GGC CAT GGC CGA 3'

BT26. 5' CCG CCG GAT CCA CCT CCG CCT GAA CCG CCT CCA CCG CTG CTC ACG
GTG ACC A 3'

Amplification primers 2nd assembly

BT27. 5' B-TGG CCT TGA TAT TCA CAA ACG AAT 3'
BT28. 5' B-ACG GCA GCC GCT GGA TTG 3'

The PCR parameters for CDR and framework region
amplification were essentially the same as described in
example 2. The PCR parameters for assembly of genes
encoding VH and VL are described in Table 7.

Table 7

PCR parameters for the assembly of VH and VL gene
sequences for library construction.

VH              VL              Amount

Primer BT25     Primer BT7      30 pmol
Primer H1       Primer L1       0.6 pmol
Primer HCDR1    Primer LCDR1    0.6 pmol
Primer HCDR2    Primer LCDR2    0.6 pmol
Primer H4       Primer L4       0.6 pmol
Primer HCDR3    Primer LCDR3    0.6 pmol
Primer H6       Primer L6       0.6 pmol
Primer BT26     Primer BT12     30 pmol
Taq             Taq             10 Units
dNTPs           dNTPs           200 µM
1x Taq buffer   1x Taq buffer   to 100 µl

Preheat 95° 10 minutes, 20 cycles: 95° 1 minute, 68° 1
minute, 72° 1 minute, and 72° 10 minutes.



The assembled VH and VL gene sequences were
assembled into a scFv coding sequence using standard
protocols (Griffiths et al 1994). A library of 1.1 x 10^9
members was constructed. Of the 40 clones tested, all
40 contained an insert of the right size, as determined by
PCR and agarose gel electrophoresis. In order to test the
variability in the library, PCR amplified and purified
inserts were subjected to cleavage by BstNI and BamHI.
Clones showed different restriction patterns, as
determined by agarose gel electrophoresis and compared to
the control scFv-B11 (Figure 9).

In order to estimate the frequency of clones able to
express scFv antibody fragments, clones from the library
containing the FLAG sequence (Hopp et al, 1989), as well
as control bacteria with and without FLAG sequence, were
plated at low density on Luria broth plates containing
100 µg/ml ampicillin, 25 µg/ml tetracycline and 1% glucose.
The plates were grown at 37°C overnight and lifted to
nitrocellulose filters by standard methods (Sambrook et
al 1989). In order to induce synthesis of the scFv genes
in the bacteria, filters were incubated for 4 h on
plates containing 0.5 mM isopropyl-thio-β-D-galactoside
(IPTG) but without glucose. Bacteria were then lysed by
lysozyme/chloroform treatment, the filters were washed
and incubated with anti-FLAG M2 antibody (Kodak) followed
by anti-mouse peroxidase conjugated second antibody (P260
Dakopatts) and detected by DAB (3,3'-diaminobenzidine
tetrahydrochloride, Sigma) (Table 8).

Table 8

Frequency of intact antibody genes in the library

Library pool         Tested clones   FLAG positive clones   Percent positive clones

A                    145             88                     60
B                    77              52                     67
C                    158             105                    66
D                    68              48                     70
All library pools    448             293                    65.4
Positive control     64              64                     100
pFABScHis scFvB11
Negative control     30              0                      0
pFABScHis


The anti-FLAG antibody detects a FLAG sequence
situated downstream of the scFv gene in the library
constructs as well as in the control vector pFABScHis
scFvB11, but not in the original vector pFABScHis.
Clones to which the anti-FLAG antibody binds therefore
contain an intact open reading frame of the scFv gene.
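
The totals and percentages in Table 8 can be re-derived from the per-pool counts. A short Python sketch of that check follows; the counts are copied from the table, while the names `POOLS` and `percent_positive` are introduced here for illustration.

```python
# Counts from Table 8: pool -> (tested clones, FLAG positive clones).
POOLS = {"A": (145, 88), "B": (77, 52), "C": (158, 105), "D": (68, 48)}


def percent_positive(tested: int, positive: int) -> float:
    """Percentage of tested clones that scored FLAG positive."""
    return 100.0 * positive / tested


total_tested = sum(tested for tested, _ in POOLS.values())
total_positive = sum(positive for _, positive in POOLS.values())

for name, (tested, positive) in POOLS.items():
    print(name, tested, positive, f"{percent_positive(tested, positive):.1f}")
print("All", total_tested, total_positive,
      f"{percent_positive(total_tested, total_positive):.1f}")
```

The pooled counts come to 448 tested and 293 positive clones, i.e. 65.4%, in agreement with the table. The per-pool percentages in the table appear to be truncated to whole numbers (for example, 88 of 145 is 60.7%, listed as 60).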
Claims

1. A method of obtaining a polynucleotide sequence
encoding a protein of desired characteristics by
incorporating variant peptide regions (variant motifs)
into defined peptide regions (scaffold sequence),
comprising the steps of

a) subjecting a parent polynucleotide sequence
encoding one or more protein motifs to mutagenesis to
create a plurality of differently mutated derivatives
thereof, or obtaining a parent polynucleotide encoding
one or more variant protein motifs;

b) providing a plurality of pairs of
oligonucleotides, each pair representing spaced apart
locations on the parent polynucleotide sequence bounding
an intervening variant protein motif, and using each said
pair of oligonucleotides as amplification primers to
amplify the intervening motif;

c) obtaining single-stranded nucleotide sequences
from the thus-isolated amplified nucleotide sequences;
and

d) assembling polynucleotide sequences encoding a
protein by incorporating nucleotide sequences derived
from step c) above with nucleotide sequence encoding
scaffold sequences.

2. A method according to claim 1 further comprising
the step of expressing the resulting protein encoded by
the assembled polynucleotide sequence and screening for
desired properties.

3. A method according to claim 1 or claim 2 wherein the
oligonucleotides are single stranded.

4. A method according to any one of the preceding
claims wherein one of said pair of oligonucleotides is
linked to a member of a specific binding pair (MSBP).

5. A method according to claim 4 further comprising the
steps of isolating the amplified variant motif by binding
the MSBP to its specific binding partner.

6. A method according to claim 4 or claim 5 wherein the
MSBP is biotin.

7. A method according to claim 7 wherein the specific
binding partner is streptavidin.

8. A method according to any one of the preceding
claims wherein the parent polynucleotide sequence is
subjected to error-prone PCR.

9. A method according to any one of the preceding
claims wherein the parent polynucleotide sequence encodes
an antibody or part thereof.

10. A polynucleotide sequence encoding a protein of
desired characteristics obtained by the method according
to any one of claims 1 to 9.

11. A polynucleotide sequence according to claim 10
wherein the protein is an antibody or fragment thereof.

12. A vector comprising a polynucleotide sequence
according to claim 10 or claim 11.

13. A host cell transformed with the vector of claim 12.

14. A method of producing a polypeptide of desired
characteristics comprising culturing the host cell of
claim 13 so that the polypeptide is produced.

15. A method according to claim 14 comprising the
further step of recovering the polypeptide produced.

16. A polynucleotide library comprising polynucleotide
sequences according to claim 10 or claim 11.

17. A protein having desired characteristics obtained by
the method according to any one of claims 1 to 9.

18. A method of creating a polynucleotide library
comprising the steps of

a) subjecting a parent
polynucleotide sequence encoding one or more protein
motifs to mutagenesis to create a plurality of
differently mutated derivatives thereof, or obtaining a
parent polynucleotide encoding one or more variant
protein motifs;

b) providing a plurality of pairs of
oligonucleotides, each pair representing spaced apart
locations on the parent polynucleotide sequence bounding
an intervening variant protein motif, and using each said
pair of oligonucleotides as amplification primers to
amplify the intervening motif;

c) obtaining single-stranded nucleotide sequences
from the thus-isolated amplified nucleotide sequences;

d) assembling polynucleotide sequences by
incorporating nucleotide sequences derived from step c)
above with nucleotide sequence encoding scaffold
sequence; and

e) inserting said polynucleotide sequences into
suitable vectors.

19. A method according to claim 18 further comprising
the step of screening the library for a protein of
desired characteristics.
Fig. 1.

Shuffling of defined regions of DNA

[Figure: flow scheme. Text recovered from the drawing, in order:]

Synthesize overlapping oligonucleotides that cover an entire gene
sequence

Assembly of oligonucleotides into full gene sequences by PCR

Mutagenesis (Clone 1, Clone 2)

Amplification of mutated area (biotin-labelled; Clone 1, Clone 2)

Separation of strands

Assembly with oligonucleotides from A and from E

Result: Defined region of the gene can be mutagenized. Defined
region from different clones can be shuffled, which allows for in
vitro protein evolution
Fig. 2.

Fig. 6.

Preparation of single-stranded DNA using Affini-Tip